A Chinese-English Organization Name Translation System Using Heuristic Web Mining and Asymmetric Alignment
Abstract
In this paper, we propose a novel system for translating organization names from Chinese to English with the assistance of web resources. First, we adopt a chunking-based segmentation method to improve the segmentation of Chinese organization names, which is plagued by the out-of-vocabulary (OOV) problem. Then, a heuristic query construction method is employed to build an efficient query for retrieving bilingual Web pages that contain translation equivalents. Finally, we align the Chinese organization name with English sentences using the asymmetric alignment method to find the best English fragment as the translation equivalent. The experimental results show that the proposed method outperforms the baseline statistical machine translation system by 30.42%.
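The final step above, picking the best English fragment for a whole Chinese name, can be illustrated with a minimal sketch. This is not the paper's asymmetric alignment method, whose details the abstract does not give. It is a simplified stand-in that scores every contiguous English span against all chunks of the Chinese name, using a hypothetical toy lexicon (`TOY_LEXICON`) and a hypothetical helper (`best_fragment`), and assuming the name has already been segmented into chunks.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy lexicon (an assumption for illustration only):
# each Chinese chunk maps to English words accepted as its translation.
TOY_LEXICON: Dict[str, Set[str]] = {
    "北京": {"beijing", "peking"},
    "大学": {"university"},
}


def best_fragment(
    chunks: List[str], sentence: str, lexicon: Dict[str, Set[str]]
) -> Tuple[str, float]:
    """Return the contiguous English span that best matches all chunks.

    The whole Chinese name (every chunk) is scored against only a
    fragment of the English sentence. The score is the F1 of chunk
    coverage (recall) and the share of span words that translate
    some chunk (precision).
    """
    words = sentence.split()
    if not chunks or not words:
        return "", 0.0

    # All English words that translate at least one chunk.
    targets: Set[str] = set()
    for chunk in chunks:
        targets |= lexicon.get(chunk, set())

    best_span, best_score = "", 0.0
    for start in range(len(words)):
        for end in range(start + 1, len(words) + 1):
            span = words[start:end]
            # Normalize case and strip surrounding punctuation for matching.
            normalized = [w.lower().strip(".,;:()") for w in span]
            span_set = set(normalized)
            covered = sum(1 for c in chunks if lexicon.get(c, set()) & span_set)
            matched = sum(1 for w in normalized if w in targets)
            recall = covered / len(chunks)
            precision = matched / len(span)
            if precision + recall == 0:
                continue
            score = 2 * precision * recall / (precision + recall)
            # Strict comparison keeps the earliest span among ties.
            if score > best_score:
                best_span, best_score = " ".join(span), score
    return best_span, best_score


fragment, score = best_fragment(
    ["北京", "大学"], "He studied at Peking University in 1998.", TOY_LEXICON
)
print(fragment, score)
```

A real system would replace the toy lexicon with translation probabilities and would need to handle transliterated or abbreviated parts of a name, which exact lexicon lookup cannot cover.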
Similar resources
A Practical Chinese-English ON Translation Method Based on ON's Distribution Characteristics on the Web
In this paper, we present a demo that translates Chinese-English organization names based on the input organization name's distribution characteristics on the web. Specifically, we first experimentally validate two assumptions that are often used in organization name translation using web resources. From experimental results, we find out several distribution characteristics of Chinese organizatio...
A Practical Chinese-English Organization Name Translation Method Based on Web Assistant
In traditional organization name translation methods, researchers usually assumed that for every organization name to be translated, its correct translation would exist somewhere on the web. Some researchers further assumed that both the organization names to be translated and their correct translations would exist on some mixed-language web pages. Thus these researchers think...
Engkoo: Mining the Web for Language Learning
This paper presents Engkoo, a system for exploring and learning language. It is built primarily by mining translation knowledge from billions of web pages using the Internet to catch language in motion. Currently Engkoo is built for Chinese users who are learning English; however the technology itself is language independent and can be extended in the future. At a system level, Engkoo is an a...
Chinese-English Organization Name Translation Based on Correlative Expansion
This paper presents an approach to translating Chinese organization names into English based on correlative expansion. First, some candidate translations are generated using a statistical translation method, and several correlative named entities for the input are retrieved from a correlative named entity list. Second, three kinds of expansion methods are used to generate some expanded que...
Exploiting the Web as Parallel Corpora for Cross-Language Information Retrieval
The expansion of the Web creates more requirements for Cross-Language Information Retrieval (CLIR). Query translation is the key problem. Previous studies have shown that query translation can be done by exploiting a large set of parallel texts. However, the problem that arises is the unavailability of large parallel corpora for many languages. In this paper, we describe a mining system that automat...
Publication date: 2009